# Replication Files for How Campaign Ads Stimulate Political Interest

This repository contains replication code and data for Canen and Martin (2021), "How Campaign Ads Stimulate Political Interest."

# How to Use This Archive

1. R scripts in `data_construction` build the analysis datasets from the raw data. They depend on proprietary data not included in the archive (described under "Sources for Proprietary Data Sets" below). Run them in the order given by the numeric prefix of each file name: all scripts beginning with "00_" must be run before scripts beginning with "01_", and so on. Scripts with the same number prefix may be run in any order.

2. R, Matlab, and Stata scripts in `analysis` produce the difference-in-differences (DiD) and regression discontinuity (RDD) estimates, along with related auxiliary analyses, from the processed data. The processed datasets are included in the archive in the `data` folder. These scripts may be run in any order. The indices below list which file produces each table or figure in the manuscript and appendices. Some tables and figures also depend on proprietary data; these are marked in the indices.
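
The ordering rule for `data_construction` can be sketched in R as follows. This is a minimal illustration, not part of the archive: the helper names `order_by_prefix` and `run_data_construction` are hypothetical, and the sketch assumes the working directory is the repository root and that each script can be run with a plain `source()` call.

```r
# Hypothetical helper: sort script paths by their leading numeric prefix
# ("00_", "01_", ...). Ties are broken alphabetically, which is harmless
# because scripts sharing a prefix may be run in any order.
order_by_prefix <- function(paths) {
  prefix <- as.integer(sub("^([0-9]+)_.*$", "\\1", basename(paths)))
  paths[order(prefix, basename(paths))]
}

# Hypothetical driver: source every prefixed R script in `dir`, in order.
# Assumes the scripts resolve their own input paths relative to the
# current working directory.
run_data_construction <- function(dir = "data_construction") {
  scripts <- list.files(dir, pattern = "^[0-9]+_.*\\.[Rr]$", full.names = TRUE)
  scripts <- order_by_prefix(scripts)
  for (s in scripts) {
    message("Running ", s)
    source(s)
  }
  invisible(scripts)
}
```

Calling `run_data_construction()` only succeeds once the proprietary inputs are in place; `order_by_prefix()` can be used on its own to preview the run order.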

# Manuscript Figures / Tables to Source Code Index

Table / Figure | Code to Produce | Requires Proprietary Data?
------ | ------ | ------
Figure 1(A)   | `pretrends.R` | 
Figure 1(B)   | `pretrends.R` | 
Figure 2(A)   | `pretrends.R` |
Figure 2(B)   | `ads_DiD.R` |
Figure 3(A)   | `ads_RDD.R` |
Figure 3(B)   | `ads_RDD.R` |
Figure 4(A)   | `ads_RDD.R` |
Figure 4(B)   | `ads_RDD.R` |
Table 1    | `ads_DiD.R` |
Table 2    | `ads_DiD.R` | Yes
Table 3(A) | `ads_DiD_cable_network.R` |
Table 3(B) | `ads_DiD_exposure.R` |
Table 4    | `tune_out_individual.R` | Yes


# Appendix Figures / Tables to Source Code Index

Table / Figure | Code to Produce | Requires Proprietary Data?
------ | ------ | ------
Figure A.1   | `DMA_map.m` |
Figure A.2   | `sample_individual.R` | Yes
Figure A.3   | `agg_campaign_spending2012.R` | Yes
Figure A.4   | `ads_DiD.R` | 
Figure B.1 | `balance.R` | 
Figure B.2 | `balance.R` |  
Figure B.3 | `group_overlap.R` | 
Figure B.4 | `stats_on_past_exposure.R` | 
Figure E.1 | `tune_out.R` |
Table A.1  | `ads_DiD.R` | 
Table A.2  | N/A  | 
Table A.3  | N/A  | 
Table A.4  | `balance.R` | Yes
Table C.1  | `ads_DiD.R` | 
Table C.2  | `non_news_placebo.R` |
Table C.3  | `non_news_placebo.R` |
Table C.4  | `ads_DiD.R` |
Table C.5  | `ads_DiD.R` | 
Table C.6  | `ads_DiD.R` |
Table C.7  | `ads_DiD.R` | 
Table C.8  | `ads_DiD_robust_balance.R` |
Table C.9  | `ads_DiD_robust_balance.R` |
Table C.10 | `rdd_stata.do` |
Table D.1  | `ads_DiD_robust_missing_programs.R` |

# Description of Data Files

The `data` directory contains the following files: 


File / Directory | Description
------ | ------ 
`dd` | Directory of day-by-day files with ad-level total news viewing by treatment/control (T/C) group in 24-hour windows before and after air time
`tuneout` | Directory of day-by-day files with ad-level tune-out decisions
`ads_dd_balance_all.RData` | Ad-level statistics (coefficient estimate and p-value) for demographic balance hypotheses
`ads_dd_tuneout_all.RData` | Ad-level estimates of tune-out rate
`all_dd_data.rds` | Ad-level total news viewing by T/C group in 24-hour windows before and after air time, used to construct DiD estimates (20-minute window)
`all_dd_data_5min.rds` | Ad-level total news viewing by T/C group in 24-hour windows before and after air time, used to construct DiD estimates (5-minute window)
`all_dd_data_10min.rds` | Ad-level total news viewing by T/C group in 24-hour windows before and after air time, used to construct DiD estimates (10-minute window)
`all_dd_data_cable_network.rds` | Ad-level total news viewing by T/C group in 24-hour windows before and after air time, used to construct DiD estimates (split by cable versus network TV)
`all_dd_data_comedy.rds` | Ad-level total comedy program viewing by T/C group in 24-hour windows before and after air time
`all_dd_data_exposure.rds` | Ad-level total political ad exposure by T/C group in 24-hour windows before and after air time
`all_dd_data_sports.rds` | Ad-level total sports program viewing by T/C group in 24-hour windows before and after air time
`all_rd_data.rds` | Device-by-ad-level news viewing in 24-hour windows before and after air time, used to construct RD estimates
`callsigns.csv` | Crosswalk from station call signs to network affiliation
`dmacodes.csv` | Crosswalk from designated market area (DMA) code to DMA name
`DMAs_in_sample.csv` | Set of DMAs covered by the set-top box (STB) data
`dma_timezone.RData` | Crosswalk from DMA to time zone, for time adjustments
`final_ads.rds` | Political ads airing in DMAs covered by the STB data from September 1, 2012 to November 6, 2012
`indiv_tuneout_all.RData` | Device-by-ad-level tune-out decisions, for devices in the control group for a given ad
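
The files above come in three formats (`.rds`, `.RData`, and `.csv`), each read with a different base R function. The sketch below shows one way to load any of them through a single call. The helper name `read_archive_file` is hypothetical and not part of the archive, and the names of the objects stored inside the `.RData` files are not documented here, so the helper returns them as a named list rather than assuming any particular name.

```r
# Hypothetical helper: read one file from the `data` directory, choosing
# the reader from the file extension.
read_archive_file <- function(path) {
  ext <- tolower(tools::file_ext(path))
  if (ext == "rds") {
    # .rds files hold a single R object
    readRDS(path)
  } else if (ext == "rdata") {
    # .RData files may hold several named objects; load them into a
    # fresh environment and return them as a named list
    env <- new.env()
    load(path, envir = env)
    as.list(env)
  } else if (ext == "csv") {
    read.csv(path, stringsAsFactors = FALSE)
  } else {
    stop("Unsupported file type: ", path)
  }
}
```

For example, assuming the working directory is the repository root, `read_archive_file("data/all_dd_data.rds")` returns the main DiD dataset, and `names(read_archive_file("data/dma_timezone.RData"))` lists the objects stored in that file.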



# Sources for Proprietary Data Sets

The paper uses three proprietary data sets:

1. Set-top box (STB) data from FourthWall Media (FWM). This is the primary data set used to track viewing and ad exposures. These data are required to run all scripts in the `data_construction` directory, as well as the indicated scripts in the `analysis` directory. Contact `sales@fourthwallmedia.tv` to license access.

2. Ad occurrence data from Nielsen Ad Intel. These data are required to run the scripts `data_construction/00_gen_ads_data.R` and `analysis/agg_campaign_spending2012.R`. We licensed the data through the Kilts Center at the University of Chicago. Non-Kilts subscribers should contact `ethan.v.markovitz@nielsen.com` for licensing inquiries. 

3. GfK Survey of the American Consumer. These data are used in the evaluation of representativeness on demographics in Table A.4 (generated by `analysis/balance.R`). Contact `Adriane.Heimann@mrisimmons.com` for licensing inquiries.

